Imports

In [118]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import mplfinance as mpf
import matplotlib.dates as mdates
import datetime as dt

import plotly.graph_objects as go
import plotly.express as px
import plotly.io as pio
from plotly.subplots import make_subplots
pio.renderers.default = "notebook"
pio.templates.default = "plotly_dark"
import gc
In [2]:
import warnings
warnings.filterwarnings('ignore')
In [3]:
plt.rcParams['figure.figsize'] = [12, 8]

Topics cum Notes

L1

topics

A brief list of topics:

  • Risk
  • Insurance
  • Diversification
  • History of Finance
  • Innovation
  • Behavioral Finance
  • Debt
  • Stocks
  • Real Estate
  • Regulation
  • Banking
  • Futures
  • Monetary Policy
  • Endowment Management
  • Investment Banking
  • Options
  • Money Managers
  • Exchanges
  • Public Finance
  • Nonprofit Finance
  • Purpose of Finance

L2

VaR: Variance / Value at Risk

Value at risk (VaR) is a measure used by some finance practitioners to quantify the risk of an investment or of a portfolio. It is quoted in units of dollars for a given probability and time horizon. For example, a 1%, one-year value at risk of 10 million means that there is a 1% chance that the portfolio will lose at least 10 million in one year.
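
As a rough sketch of the idea, VaR can be estimated from a sample of returns as a low quantile of the profit-and-loss distribution. The returns below are simulated from a normal distribution, and the portfolio value, mean and volatility are made-up numbers chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: a 100 million portfolio with simulated one-year returns
portfolio_value = 100_000_000
simulated_returns = rng.normal(loc=0.05, scale=0.15, size=100_000)

# Profit and loss in dollars for each simulated year
pnl = portfolio_value * simulated_returns

# The 1% one-year VaR is the loss that is exceeded in only 1% of the simulated years
var_1pct = -np.percentile(pnl, 1)
print(f"1% one-year VaR: {var_1pct:,.0f}")
```

With real data, the simulated returns would be replaced by historical or model-generated returns; the quantile step stays the same.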

Stress Testing

A stress test is a test, usually ordered by a government, to see how a firm would stand up to a financial crisis.

S&P 500

In [4]:
snp = pd.read_csv('Data/GSPC.csv', parse_dates=['Date'], index_col='Date')
snp.head()
Out[4]:
Open High Low Close Volume
Date
1927-12-30 17.660000 17.660000 17.660000 17.660000 0.0
1928-01-03 17.760000 17.760000 17.760000 17.760000 0.0
1928-01-04 17.719999 17.719999 17.719999 17.719999 0.0
1928-01-05 17.549999 17.549999 17.549999 17.549999 0.0
1928-01-06 17.660000 17.660000 17.660000 17.660000 0.0
In [5]:
px.line(snp, x=snp.index, y='Close', title='S&P 500')

Beta

Beta gives a measure of how much a stock moves in relation to the market. A $\beta$ of 2 means that the stock moves twice as much as the market. A $\beta$ of 0.5 means that the stock moves half as much as the market.

In [49]:
apple = pd.read_csv('Data/AAPL.csv', parse_dates=['Date'], index_col='Date')
google = pd.read_csv('Data/GOOG.csv', parse_dates=['Date'], index_col='Date')
apple.head()
Out[49]:
Open High Low Close Volume
Date
1980-12-12 00:00:00-05:00 0.099874 0.100308 0.099874 0.099874 469033600
1980-12-15 00:00:00-05:00 0.095098 0.095098 0.094663 0.094663 175884800
1980-12-16 00:00:00-05:00 0.088149 0.088149 0.087715 0.087715 105728000
1980-12-17 00:00:00-05:00 0.089886 0.090320 0.089886 0.089886 86441600
1980-12-18 00:00:00-05:00 0.092492 0.092927 0.092492 0.092492 73449600
In [95]:
def merge_two_stocks(df1:pd.DataFrame, df2:pd.DataFrame, names=["df1", "df2"], columns=None, date_too=True)->pd.DataFrame:
    """
    Merge two stocks together on index (Assumes index is date)

    Parameters
    ----------
    df1 : pd.DataFrame
        First dataframe
    df2 : pd.DataFrame
        Second dataframe
    names : list, optional
        Names of the two dataframes (Stock names, suffix will be decided by it), by default ["df1", "df2"]
    columns : list, optional
        Columns to merge, by default None
    date_too : bool, optional
        Whether to include the date column, by default True

    Returns
    -------
    pd.DataFrame
        Merged dataframe
    """
    df1 = df1.copy()
    df2 = df2.copy()
    if columns:
        df1 = df1[columns]
        df2 = df2[columns]
    df1.index = pd.Series(df1.index).apply(lambda x: x.strftime("%Y-%m-%d"))
    df2.index = pd.Series(df2.index).apply(lambda x: x.strftime("%Y-%m-%d"))
    df = df1.merge(
        df2,
        how="inner",
        left_index=True,
        right_index=True,
        suffixes=("_" + names[0], "_" + names[1]),
    )
    if date_too:
        df.index = pd.to_datetime(df.index)
        df["Date"] = df.index
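    # Use the plain stock names when a single column was selected.
    # The `columns and` guard avoids a TypeError when columns is None.
    if columns and len(columns) == 1 and date_too:
        df.columns = [names[0], names[1], "Date"]
    elif columns and len(columns) == 1 and not date_too:
        df.columns = [names[0], names[1]]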
    return df
In [96]:
apple_google = merge_two_stocks(apple, google, names=["Apple", "Google"], date_too=False, columns=["Open", "Close"])
apple_snp = merge_two_stocks(apple, snp, names=["Apple", "S&P 500"], columns=['Close'])
In [97]:
apple_snp.head()
Out[97]:
Apple S&P 500 Date
Date
1980-12-12 0.099874 129.229996 1980-12-12
1980-12-15 0.094663 129.449997 1980-12-15
1980-12-16 0.087715 130.600006 1980-12-16
1980-12-17 0.089886 132.889999 1980-12-17
1980-12-18 0.092492 133.000000 1980-12-18
In [98]:
apple_google.head()
Out[98]:
Open_Apple Close_Apple Open_Google Close_Google
Date
2004-08-19 0.479638 0.467460 2.490664 2.499133
2004-08-20 0.467460 0.468830 2.515820 2.697639
2004-08-23 0.469743 0.473092 2.758411 2.724787
2004-08-24 0.475832 0.486335 2.770615 2.611960
2004-08-25 0.485117 0.503079 2.614201 2.640104
In [99]:
#plot apple and snp with different y axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

fig.add_trace(
    go.Scatter(x=apple_snp.Date, y=apple_snp['Apple'], name="Apple"),
    secondary_y=False,
)

fig.add_trace(
    go.Scatter(x=apple_snp.Date, y=apple_snp['S&P 500'], name="S&P 500"),
    secondary_y=True,
)
# Set figure title
fig.update_layout(
    title_text="Apple vs S&P 500"
)

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>primary</b> Apple", secondary_y=False)
fig.update_yaxes(title_text="<b>secondary</b> S&P 500", secondary_y=True)

Beta can be calculated by dividing the standard deviation of the security's returns by the standard deviation of the benchmark's returns, and then multiplying the result by the correlation between the two return series. Mathematically, the formula is: $$ \beta = \frac{\sigma_{s}}{\sigma_{b}} \times \rho_{s,b} $$ where $\sigma_{s}$ is the standard deviation of the security's returns, $\sigma_{b}$ is the standard deviation of the benchmark's returns, and $\rho_{s,b}$ is the correlation between the security's returns and the benchmark's returns.

In [100]:
apple_snp["Apple_returns"] = apple_snp["Apple"].pct_change()
apple_snp["S&P_returns"] = apple_snp["S&P 500"].pct_change()

apple_snp.dropna(inplace=True)
In [101]:
apple_snp
Out[101]:
Apple S&P 500 Date Apple_returns S&P_returns
Date
1980-12-15 0.094663 129.449997 1980-12-15 -0.052171 0.001702
1980-12-16 0.087715 130.600006 1980-12-16 -0.073398 0.008884
1980-12-17 0.089886 132.889999 1980-12-17 0.024751 0.017534
1980-12-18 0.092492 133.000000 1980-12-18 0.028992 0.000828
1980-12-19 0.098137 133.699997 1980-12-19 0.061029 0.005263
... ... ... ... ... ...
2022-12-16 134.509995 3852.360107 2022-12-16 -0.014579 -0.011138
2022-12-19 132.369995 3817.659912 2022-12-19 -0.015910 -0.009008
2022-12-20 132.300003 3821.620117 2022-12-20 -0.000529 0.001037
2022-12-21 135.449997 3878.439941 2022-12-21 0.023809 0.014868
2022-12-22 132.229996 3822.389893 2022-12-22 -0.023773 -0.014452

10597 rows × 5 columns

In [102]:
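# Sample the month-end rows. Note: the return columns still hold the
# one-day return on those dates, not the month-over-month return.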
apple_snp_month = apple_snp.asfreq("M", method="ffill")
In [103]:
apple_snp_month.head()
Out[103]:
Apple S&P 500 Date Apple_returns S&P_returns
Date
1980-12-31 0.118546 135.759995 1980-12-31 -0.028468 0.003177
1981-01-31 0.098137 129.550003 1981-01-30 -0.054397 -0.005298
1981-02-28 0.092058 131.270004 1981-02-27 0.034153 0.008993
1981-03-31 0.085110 136.000000 1981-03-31 -0.010101 0.012809
1981-04-30 0.098571 132.809998 1981-04-30 0.017936 -0.001804
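
Because `asfreq` only selects the month-end rows, the return columns above are one-day returns observed at month end. A minimal sketch of one way to get month-over-month returns instead is shown below. It uses synthetic prices rather than the notebook's data, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic daily closing prices on business days (illustrative data only)
rng = np.random.default_rng(1)
dates = pd.bdate_range("2020-01-01", "2020-12-31")
prices = pd.Series(100 * np.cumprod(1 + rng.normal(0, 0.01, len(dates))), index=dates)

# Last available price in each calendar month
month_end_prices = prices.groupby(prices.index.to_period("M")).last()

# Month-over-month return computed from the month-end prices
monthly_returns = month_end_prices.pct_change().dropna()

# Equivalent view: compound the daily returns that fall inside each month
daily_returns = prices.pct_change().dropna()
compounded = (1 + daily_returns).groupby(daily_returns.index.to_period("M")).prod() - 1

print(monthly_returns.head())
```

The two calculations agree for every month after the first, since compounding the daily returns within a month reproduces the ratio of consecutive month-end prices.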
In [104]:
# Plot returns
fig = make_subplots()
fig.add_traces(
    [
        go.Scatter(y=apple_snp_month["S&P_returns"], x=apple_snp_month.index,opacity=1.0, name="S&P"),
        go.Scatter(y=apple_snp_month["Apple_returns"], x=apple_snp_month.index,opacity=0.3, name="Apple")
    ]
)
In [147]:
fig = px.scatter(apple_snp_month, x="S&P_returns", y="Apple_returns", trendline="ols", trendline_color_override="red")
fig.update_layout(
    title="Apple vs S&P 500 (Monthly Returns)",
    xaxis_title="S&P 500 Returns",
    yaxis_title="Apple Returns",
    font=dict(
        family="Courier New, monospace",
        size=18,
        color="#7f7f7f"
    )
)
fig.show()

$\beta$ is nothing but the slope of the regression line of the security's returns on the benchmark's returns.

In [148]:
# Calculate beta for apple
covariance = apple_snp_month["Apple_returns"].cov(apple_snp_month["S&P_returns"])
variance = apple_snp_month["S&P_returns"].var()
apple_beta = covariance / variance
print(f"Apple's beta is {apple_beta}")
Apple's beta is 1.3353789585800684
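
The three descriptions of beta used in these notes (covariance over variance, ratio of standard deviations times correlation, and regression slope) are the same number. The sketch below checks this on simulated returns; the data and the "true" beta of 1.3 are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated returns: the security is built with a true beta of 1.3 plus noise
market = rng.normal(0.005, 0.04, 500)
security = 1.3 * market + rng.normal(0, 0.06, 500)

# 1) Covariance over variance
beta_cov = np.cov(security, market, ddof=1)[0, 1] / np.var(market, ddof=1)

# 2) Ratio of standard deviations times correlation
rho = np.corrcoef(security, market)[0, 1]
beta_rho = np.std(security, ddof=1) / np.std(market, ddof=1) * rho

# 3) Slope of the least-squares line of security returns on market returns
beta_ols = np.polyfit(market, security, 1)[0]

print(beta_cov, beta_rho, beta_ols)
```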

Let's see how this varies year by year.

In [106]:
def clc_beta(year):
    data = apple_snp[apple_snp["Date"].dt.year == year]
    covariance = data["Apple_returns"].cov(data["S&P_returns"])
    variance = data["S&P_returns"].var()
    apple_beta = covariance / variance
    return apple_beta

betas = []
years = np.arange(1980,2023)
for year in years:
    betas.append(clc_beta(year))
In [107]:
fig = px.line(x=years, y=betas)
fig.update_xaxes(title_text="Year")
fig.update_yaxes(title_text=r"Beta for Apple")

Market and Idiosyncratic Risk

Market risk is the risk that affects all stocks in the market. Idiosyncratic risk is the risk that affects only one stock.
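
One way to make this split concrete is to regress a stock's returns on the market's returns: the part explained by beta is market risk and the leftover is idiosyncratic risk. The sketch below uses simulated returns with assumed volatilities, so the resulting shares are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated returns (illustrative): stock = beta * market + idiosyncratic noise
market = rng.normal(0, 0.04, 5000)
noise = rng.normal(0, 0.06, 5000)
stock = 1.2 * market + noise

# Estimate beta, then split the stock's variance into two pieces
beta = np.cov(stock, market)[0, 1] / np.var(market, ddof=1)
residual = stock - beta * market

market_var = beta**2 * np.var(market, ddof=1)   # variance explained by the market
idio_var = np.var(residual, ddof=1)             # variance left over
total_var = np.var(stock, ddof=1)

print(f"market share of variance: {market_var / total_var:.2f}")
print(f"idiosyncratic share:      {idio_var / total_var:.2f}")
```

The two pieces add up to the total variance because the regression residual is uncorrelated with the market returns in the sample.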

Normal and Cauchy Distribution

In [17]:
np.random.seed(42)
normal = np.random.normal(0,1,2000)
cauchy = np.random.standard_cauchy(2000)
distribution = np.array([normal, cauchy]).T
distribution = pd.DataFrame(distribution, columns = ["Normal", "Cauchy"])
distribution
Out[17]:
Normal Cauchy
0 0.496714 4.671910
1 -0.138264 2.573113
2 0.647689 -8.877968
3 1.523030 -0.001475
4 -0.234153 0.703143
... ... ...
1995 1.070150 1.755178
1996 -0.026521 -1.924287
1997 -0.881875 5.036051
1998 -0.163067 -0.946808
1999 -0.744903 -0.399862

2000 rows × 2 columns

In [18]:
fig = make_subplots()
fig.add_traces(
    [
        go.Histogram(x=distribution["Normal"], name="Normal Distribution"),
        go.Histogram(x=distribution["Cauchy"], name="Cauchy Distribution")
    ]
)

We can see that the Cauchy distribution is fat-tailed. To see it clearly, let's plot the 'return' of both the distributions.

In [19]:
distribution["Normal_Returns"] = distribution["Normal"].pct_change()
distribution["Cauchy_Returns"] = distribution["Cauchy"].pct_change()
distribution.dropna(inplace=True)
In [20]:
fig = make_subplots()
fig.add_traces(
    [
        go.Scatter(y=distribution["Normal_Returns"], name="Normal Distribution",opacity=0.5),
        go.Scatter(y=distribution["Cauchy_Returns"], name="Cauchy Distribution",opacity=0.5)
    ]
)

The many huge spikes show that, under the Cauchy distribution, values very far from the center still have a substantial probability of occurring.

In [24]:
distribution.describe()
Out[24]:
Normal Cauchy Normal_Returns Cauchy_Returns
count 1999.000000 1999.000000 1999.000000 1999.000000
mean 0.044858 0.252416 -0.910172 -6.744965
std 0.988661 148.554806 13.780630 148.667582
min -3.241267 -5614.733511 -186.627834 -3817.930112
25% -0.622674 -1.009257 -1.971528 -2.052198
50% 0.043811 0.004358 -1.023974 -0.998176
75% 0.684002 1.003477 0.051385 -0.092917
max 3.852731 3001.984748 320.541185 2619.054166

Central Limit Theorem

The central limit theorem says that the sum of a large number of independent random variables with finite variance will be approximately normally distributed. (This does not work if the underlying distribution is so fat-tailed that its variance is infinite.)

Let's see if the central limit theorem holds for Cauchy distribution.

In [21]:
means = []
for _ in range(1000):
    cauchy = np.random.standard_cauchy(2000)
    means.append(cauchy.mean())
In [22]:
fig = px.histogram(x=means)
fig.update_xaxes(title_text="Mean of 2000 Cauchy Draws")
fig.update_yaxes(title_text=r"Count")

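The CLT does not hold here: the means of Cauchy samples do not settle into a normal distribution, because the Cauchy distribution has no finite mean or variance.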
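
A quick way to see the difference is to compare how the spread of sample means changes with sample size. The sketch below uses the interquartile range (IQR) as the measure of spread, since the variance of Cauchy means is not finite. The helper name `iqr_of_sample_means` is made up for this illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

def iqr_of_sample_means(draw, sample_size, n_repeats=2000):
    """Interquartile range of the means of n_repeats samples of the given size."""
    means = np.array([draw(sample_size).mean() for _ in range(n_repeats)])
    q75, q25 = np.percentile(means, [75, 25])
    return q75 - q25

for n in (10, 1000):
    normal_iqr = iqr_of_sample_means(rng.standard_normal, n)
    cauchy_iqr = iqr_of_sample_means(rng.standard_cauchy, n)
    print(f"n={n:5d}  normal IQR={normal_iqr:.3f}  cauchy IQR={cauchy_iqr:.3f}")
```

For normal draws the spread of the means shrinks as the sample grows. For Cauchy draws it stays roughly the same, because the mean of Cauchy draws is itself Cauchy distributed.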

Covariance and Correlation

Covariance between two stocks measures how the two stocks move together. If the covariance is zero, the two stocks are uncorrelated (independent stocks have zero covariance, but zero covariance alone does not guarantee independence). If the covariance is positive, the two stocks tend to move in the same direction. If the covariance is negative, the two stocks tend to move in opposite directions. Mathematically, covariance is defined as: $$ \operatorname{cov}(X, Y)=\operatorname{E}\left[(X-\mu_{X})(Y-\mu_{Y})\right] $$ where $\mu_X$ and $\mu_Y$ are the means of $X$ and $Y$ respectively.

For example, let's calculate the covariance between the close price of Apple and Google.

In [109]:
# Calculate covariance between apple and google
covariance = apple_google["Close_Apple"].cov(apple_google["Close_Google"])
print(f"The covariance between Apple and Google is {covariance}")
The covariance between Apple and Google is 1553.482172859917

The formula for correlation is: $$ \rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} $$ where $\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$ respectively. Let's calculate the correlation between the close price of Apple and Google.
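
To connect the two formulas, the sketch below computes covariance and correlation directly from their definitions on simulated data. The two series and their shared component are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(5)

# Two simulated return series that share a common component (illustrative)
common = rng.normal(0, 0.01, 1000)
x = common + rng.normal(0, 0.01, 1000)
y = common + rng.normal(0, 0.01, 1000)

# Covariance from the definition, using sample means (population form, ddof=0)
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))

# Correlation rescales the covariance by both standard deviations
rho_xy = cov_xy / (x.std() * y.std())

print(cov_xy, rho_xy)
```

Unlike covariance, the correlation does not depend on the units of the two series and always lies between -1 and 1.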

In [113]:
# Calculate correlation between apple and google
correlation = apple_google["Close_Apple"].corr(apple_google["Close_Google"])
print(f"The correlation between Apple and Google is {correlation}")
The correlation between Apple and Google is 0.9728011568137019

What about the correlation between the returns? Let's calculate it.

In [114]:
apple_google["Return_apple"] = apple_google["Close_Apple"].pct_change()
apple_google["Return_google"] = apple_google["Close_Google"].pct_change()

apple_google.dropna(inplace=True)

corr = apple_google["Return_apple"].corr(apple_google["Return_google"])
print(f"The correlation between Apple and Google is {corr}")
The correlation between Apple and Google is 0.5204530616935059

Great! This correlation is for all the data. Let's calculate it for the past year.

In [117]:
apple_google_last_year = apple_google[apple_google.index > "2022-01-01"]
new_corr = apple_google_last_year["Return_apple"].corr(apple_google_last_year["Return_google"])
print(f"The correlation between Apple and Google is {new_corr}")
The correlation between Apple and Google is 0.7889636213999328

This correlation (about 0.79) is fairly close to 1. This means that the returns of the two stocks tended to move in the same direction over this period.

In [120]:
corr = apple_snp["Apple_returns"].corr(apple_snp["S&P_returns"])
print(f"The correlation between Apple and S&P is {corr}")
The correlation between Apple and S&P is 0.490176406429941

L3

Insurance

Risk Pooling

Assuming independence, the number of claims follows a binomial distribution. If there are $n$ policies and each has probability $p$ of a claim, the standard deviation of the fraction of policies that result in a claim is $$ \sigma=\sqrt{p(1-p)/n} $$ This means that if $n$ is large, the standard deviation is small. This follows from the Law of Large Numbers and is the idea behind risk pooling.

In [138]:
binomial = np.random.binomial(100, 0.5, 1000)

fig = px.histogram(x=binomial)
fig.update_xaxes(title_text="Value From Binomial Distribution")
fig.update_yaxes(title_text=r"Count")
fig.show()
In [139]:
mean = binomial.mean()
print(f"The mean of the binomial distribution is {mean}")
The mean of the binomial distribution is 49.91
In [140]:
std = binomial.std()
print(f"The standard deviation of the binomial distribution is {std}")
The standard deviation of the binomial distribution is 5.020746956380097
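
The formula $\sqrt{p(1-p)/n}$ describes the fraction of policies that claim, not the raw count. The sketch below simulates that fraction for a small and a large pool; the claim probability of 5% is an assumed value.

```python
import numpy as np

rng = np.random.default_rng(6)
p = 0.05  # assumed probability that a single policy results in a claim

for n in (100, 10_000):
    # Number of claims in each of 5000 simulated years, as a fraction of policies
    claim_fraction = rng.binomial(n, p, size=5000) / n
    theory = np.sqrt(p * (1 - p) / n)
    print(f"n={n:6d}  simulated std={claim_fraction.std():.5f}  theory={theory:.5f}")
```

A pool that is 100 times larger has a claim fraction that is about 10 times less variable, which is what makes the insurer's costs per policy predictable.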

Moral Hazard and Selection Bias

Moral hazard is the tendency of people to take more risk when they are insured. Selection bias is the tendency of people to buy insurance when they are more likely to have a claim.

Health Insurance

The Health Maintenance Organization

HMO is a type of health insurance that provides health care through a network of doctors and hospitals. The HMO is paid a fixed amount per month for each member. The HMO pays the doctors and hospitals a fixed amount for each service. This way, the doctors have an incentive to keep the patients healthy.

EMTALA (Emergency Medical Treatment and Active Labor Act)

L4

Portfolio Management as an Alternative to Insurance

Risk is inherent in investment.

Portfolio Diversification

All that should matter to an investor is the performance of the entire portfolio. The performance of the individual stocks should not matter. Only the mean and variance of the portfolio should matter.
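
A small sketch of why diversification lowers portfolio variance, under the simplifying assumptions that all assets are independent, have the same standard deviation, and are held in equal weights. The 20% standard deviation is a made-up number.

```python
import numpy as np

sigma = 0.20  # assumed standard deviation of each asset's return

for n in (1, 10, 100):
    weights = np.full(n, 1 / n)      # equal weights
    cov = np.eye(n) * sigma**2       # independent assets: diagonal covariance matrix
    portfolio_std = np.sqrt(weights @ cov @ weights)
    print(f"{n:3d} assets -> portfolio std = {portfolio_std:.4f}")
```

Under these assumptions the portfolio standard deviation falls as $\sigma/\sqrt{n}$. With correlated assets the off-diagonal covariance terms would limit how far it can fall.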

Hedge Funds

Systematic Risk

It is a risk that affects all stocks in the market. It is also called market risk.

Capital Asset Pricing Model (CAPM)

It's a model of the optimal portfolio. It asserts that all investors will hold the optimal portfolio. But since not everyone holds the optimal portfolio, the model is only a half-truth.

The model assumes that everyone is rational. It assumes that nobody has any risks that are inherent to them.

The basic equation of CAPM reads: $$ E(r_i) = r_f + \beta_i (E(r_m) - r_f) $$ where $r_i$ is the return of the stock, $r_f$ is the risk-free rate, $\beta_i$ is the beta of the stock, and $E(r_m)$ is the expected return of the market.

What it says is this: the expected return of a stock equals the risk-free rate plus the stock's beta times the market risk premium, that is, the expected return of the market minus the risk-free rate.
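
The CAPM equation translates directly into a one-line function. The sketch below evaluates it for a few betas; the 3% risk-free rate and 8% expected market return are hypothetical inputs, and the function name is made up for the example.

```python
def capm_expected_return(risk_free, beta, market_return):
    """Expected return implied by the CAPM equation."""
    return risk_free + beta * (market_return - risk_free)

# Hypothetical inputs: 3% risk-free rate, 8% expected market return
for beta in (0.5, 1.0, 1.5):
    print(beta, capm_expected_return(0.03, beta, 0.08))
```

A beta of 1 reproduces the market's expected return, and a beta of 0 reproduces the risk-free rate.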

What is the risk-free rate? It is the return of a risk-free asset. For example, the return of a 10-year US Treasury bond.

Short Sales

Holding negative shares of a stock is called short selling. It is a way to bet against a stock. For example, if you think that a stock will go down, you can short sell it. If you are right, you will make money. If you are wrong, you will lose money.

This works by borrowing the stock from someone and selling it. Then you buy the stock back at a lower price and return it to the original owner. The difference between the two prices is your profit. Usually, the broker will lend you the stock at a small fee.
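
The arithmetic of a short sale can be sketched as follows. The share count, prices and borrowing fee are hypothetical, and the function ignores margin requirements and dividends for simplicity.

```python
def short_sale_profit(shares, sell_price, buy_back_price, borrow_fee=0.0):
    """Profit from selling borrowed shares and buying them back later."""
    return shares * (sell_price - buy_back_price) - borrow_fee

# Hypothetical trade: short 100 shares at 50, paying a fee of 25 to borrow them
print(short_sale_profit(100, 50.0, 40.0, borrow_fee=25.0))  # price falls: gain
print(short_sale_profit(100, 50.0, 65.0, borrow_fee=25.0))  # price rises: loss
```

The gain is capped because the price cannot fall below zero, while the loss grows without limit as the price rises.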

In the CAPM, short selling is allowed; however, we must assume that on average it is negligible. If it were widespread, everyone would do it, and the question would arise of who would lend the stock.

Calculating the Optimal Portfolio

Efficient Portfolio Frontier

The efficient portfolio frontier expresses the standard deviation of the portfolio in terms of $r$, the expected return on the portfolio, instead of $x_1$, the share of the portfolio held in the first asset.
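
For two risky assets, the change of variable from $x_1$ to $r$ can be sketched as below: solve for the weight that delivers a target expected return, then plug it into the portfolio variance. All inputs (expected returns, standard deviations, correlation) are hypothetical.

```python
import numpy as np

# Hypothetical two-asset inputs: expected returns, standard deviations, correlation
r1, r2 = 0.10, 0.04
s1, s2 = 0.20, 0.08
rho = 0.2

def frontier_std(r):
    """Portfolio standard deviation as a function of its expected return r."""
    x1 = (r - r2) / (r1 - r2)        # weight on asset 1 that delivers return r
    x2 = 1 - x1
    variance = x1**2 * s1**2 + x2**2 * s2**2 + 2 * x1 * x2 * rho * s1 * s2
    return np.sqrt(variance)

for r in np.linspace(0.04, 0.10, 4):
    print(f"expected return {r:.2f} -> std {frontier_std(r):.4f}")
```

At the two endpoints the portfolio holds a single asset, so the function returns that asset's own standard deviation.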

Gordon Growth Model

If a company's dividends grow at a constant rate, the value of the company is $$ V = \frac{D_1}{r-g} $$ where $D_1$ is the dividend in the next year, $r$ is the rate of discount, and $g$ is the growth rate.

The same applies to any asset whose income grows at a constant rate. For land, for example, the value is $$ V = \frac{D_1}{r-g} $$ where $D_1$ is the rent in the next year, $r$ is the rate of discount, and $g$ is the growth rate of the rent.

This can be calculated by summing up the infinite series: $$ V = \frac{D_1}{1+r} + \frac{D_1(1+g)}{(1+r)^2} + \frac{D_1(1+g)^2}{(1+r)^3} + \cdots $$

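The discount rate $r$ reflects the risk of the security: a riskier asset is discounted at a higher rate. The formula requires $r > g$, otherwise the series does not converge. The equation estimates the current price of the asset. If the current price is higher than the estimated price, the asset is overvalued. If the current price is lower than the estimated price, the asset is undervalued.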
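
As a quick numerical check, the closed-form value can be compared with a long partial sum of the discounted payments. The dividend, discount rate and growth rate below are hypothetical, and the function names are made up for the example.

```python
def gordon_value(d1, r, g):
    """Closed-form Gordon growth value; requires r > g."""
    if r <= g:
        raise ValueError("the discount rate r must exceed the growth rate g")
    return d1 / (r - g)

def discounted_sum(d1, r, g, years):
    """Partial sum of the discounted, growing payment stream."""
    return sum(d1 * (1 + g) ** (t - 1) / (1 + r) ** t for t in range(1, years + 1))

# Hypothetical inputs: next-year dividend 2.0, discount rate 8%, growth rate 3%
print(gordon_value(2.0, 0.08, 0.03))
print(discounted_sum(2.0, 0.08, 0.03, 1000))
```

The partial sum approaches the closed-form value as the number of years grows, because each term shrinks by the factor $(1+g)/(1+r) < 1$.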